Data Scaling in OBDA Benchmarks: The VIG Approach

Authors

  • Davide Lanti
  • Guohui Xiao
  • Diego Calvanese
Abstract

In this paper we describe VIG, a data scaler for benchmarks in the context of ontology-based data access (OBDA). Data scaling is a relatively recent approach, proposed in the database community, that allows for quickly scaling up an input data instance to s times its size while preserving certain application-specific characteristics. The advantage of the approach is that the user is not required to manually input the characteristics of the data to be produced, which makes it particularly suitable for OBDA benchmarks, where the complexity of database schemas can make manual input challenging (e.g., the NPD benchmark contains 70 tables, some with more than 60 columns). Unlike a traditional data scaler, VIG also uses the domain information provided by the OBDA mappings and the ontology in order to produce data. VIG is currently used in the NPD benchmark, but it is not NPD-specific and can be seeded with any data instance. The distinguishing features of VIG are (1) its simple and clear generation strategy; (2) its efficiency, as each value is generated in constant time, without accessing the disk or RAM to retrieve previously generated values; and (3) its generality, as the data is exported as CSV files that can be easily imported by any RDBMS. VIG is a Java implementation licensed under Apache 2.0, and its source code is available on GitHub (https://github.com/ontop/vig) in the form of a Maven project. The code has been maintained for two years by the Ontop team at the Free University of Bozen-Bolzano.
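To make feature (2) concrete, the following is a minimal sketch of one way a column value can be computed in constant time from its row index alone, with no lookup of earlier values. It is not taken from VIG's source code: the function names `make_column_generator` and `scaled_column`, the default constants, and the choice of an affine permutation modulo the number of distinct values are all assumptions made for illustration. The sketch scales both the row count and the distinct-value count by the same factor, which is one simple reading of "preserving characteristics" of the seed data.

```python
from math import gcd


def make_column_generator(distinct: int, multiplier: int = 7919, offset: int = 3):
    """Return a function mapping a row index to a column value.

    Hypothetical helper: i -> (a * i + offset) % distinct is a permutation of
    0..distinct-1 whenever gcd(a, distinct) == 1, so the first `distinct`
    rows are pairwise different without remembering any earlier value.
    """
    if distinct <= 0:
        raise ValueError("distinct must be positive")
    a = multiplier
    # Bump the multiplier until it is coprime to the modulus.
    while gcd(a, distinct) != 1:
        a += 1

    def value_at(i: int) -> int:
        # Constant time: depends only on the row index i.
        return (a * i + offset) % distinct

    return value_at


def scaled_column(seed_rows: int, seed_distinct: int, scale: int) -> list[int]:
    """Generate a column scaled to `scale` times the seed size.

    Row count and distinct-value count grow by the same factor, so the
    ratio of duplicates observed in the seed column is kept.
    """
    value_at = make_column_generator(seed_distinct * scale)
    return [value_at(i) for i in range(seed_rows * scale)]


if __name__ == "__main__":
    # Seed column: 10 rows with 4 distinct values, scaled by a factor of 3.
    column = scaled_column(seed_rows=10, seed_distinct=4, scale=3)
    print(len(column), len(set(column)))
```

Because every value is a pure function of its index, rows can be written straight to a CSV file in a single pass, and different index ranges could in principle be generated independently of one another.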

Similar resources

Fast and Simple Data Scaling for OBDA Benchmarks

In this paper we describe VIG, a data scaler for OBDA benchmarks. Data scaling is a relatively recent approach, proposed in the database community, that allows for quickly scaling an input data instance to n times its size, while preserving certain application-specific characteristics. The advantages of the scaling approach are that the same generator is general, in the sense that it can be re-...

An Evaluation of VIG with the BSBM Benchmark

We present an experimental evaluation of VIG, a data scaler for OBDA benchmarks. Data scaling is a relatively recent approach, proposed in the database community, that allows for scaling an input data instance to s times its size, while preserving certain application-specific characteristics. A data scaler is a “general” generator, in the sense that it can be re-used on different database schem...

A Scalable Benchmark for OBDA Systems: Preliminary Report

In ontology-based data access (OBDA), the aim is to provide a high-level conceptual view over potentially very large (relational) data sources by means of a mediating ontology. The ontology is connected to the data sources through a declarative specification given in terms of mappings that relate each (class and property) symbol in the ontology to an (SQL) view over the data. Although prototype ...

The NPD Benchmark for OBDA Systems

In Ontology-Based Data Access (OBDA), queries are posed over a high-level conceptual view, and then translated into queries over a potentially very large (usually relational) data source. The ontology is connected to the data sources through a declarative specification given in terms of mappings. Although prototype OBDA systems providing the ability to answer SPARQL queries over the ontology ar...

The NPD Benchmark: Reality Check for OBDA Systems

In the last decades we have moved from a world in which an enterprise had one central database (rather small by today's standards) to a world in which many different, and big, databases must interact and operate, providing the user with an integrated and understandable view of the data. Ontology-Based Data Access (OBDA) is becoming a popular approach to cope with this new scenario. OBDA separates the user f...

Journal:
  • CoRR

Volume: abs/1607.06343

Publication date: 2016